Abstract. Modeling virtual humans that can exhibit realistic personalities is becoming
increasingly important as virtual humans are being widely used for inter-personal
skills education. We present Virtual Human Personality Masks, a system that combines
human computation with the idea of using existing virtual humans to bootstrap the
creation of other virtual humans to enable quick and easy generation of perceivable
verbal personalities in virtual humans.

To evaluate this system, we created high and low verbosity-level variants of
a virtual patient with symptoms of depression and conducted a user study with
medical students split between two groups, each interacting with one of the two
variants of the virtual patient. The participants’ perceived verbosity levels of the
virtual patients indicated that the virtual patients created using our system not
only exhibited the intended personality in a perceivable manner, but also exhibited
other related personality attributes in a manner consistent with the human
personality theory analogs of verbosity.

Keywords: Virtual Humans, Personalities, Verbosity, Human Computation, 
Conversational Agents, Virtual Patients. 

1 Introduction 

Virtual Humans (VHs) are increasingly being used as conversational partners in
inter-personal skills education [1, 2]. To increase usefulness, inter-personal skills
education should provide interactions with VHs that exhibit realistic personalities.
In this paper, we propose a method for rapidly creating variants of existing VHs that
can exhibit perceivable personalities using a human-computation-based approach [3].

Existing models of personality for conversational agents often use fixed sets of
personality variables in combination with a decision algorithm to add verbal (the
corpus) or non-verbal (gestures and emotions) personalities to the agent [4-9]. One
such approach models the personality variables as states in a finite state machine [4].
Another approach uses a layered Bayesian Belief Network to model personalities, moods
and emotions [8]. Yet another approach uses multiple dimensions to represent a given
personality state [9]. While these can be used to represent most generic personality
states, the amount of time required to apply such models to specific scenarios depends
on the extensiveness of the model creation process and the amount of refining required
to apply the model to the scenario. Our approach aims at reducing these dependencies
using personality masks created from the responses of real humans. The personality
masks are created by gathering different human responses to a subset of the stimuli
from an existing stimulus-response corpus and can then be applied onto the original
corpus, resulting in different personality variants of the original VH. We built our
system on top of Virtual People Factory (VPF), an online application for users to
create and interact with VHs [11].

To demonstrate the generic concept of using the Personality Masks system to generate
perceivable personalities, we created high and low verbosity masks and used the
resulting verbosity-masked VHs to evaluate whether the intended verbosity levels were
perceivable to users. The results indicated that the VHs created using our system
exhibited the intended verbosity levels in a manner that was perceivable to the
interactants and consistent with the human personality theory analogs of verbosity.

2 Virtual Human Personality Masks 

2.1 Verbosity and Patients with Depression 

Patients’ talkativeness (verbosity) levels in medical interviews can be influenced by
factors such as their level of comfort with the interviewer, the environment, their age
or the presence of personality disorders [12]. Verbosity is often associated with
extraversion, one of the five factors in the Big Five theory of personalities [13].
Extraverted people are perceived as easier to interview, more willing to disclose
information and more cooperative, and extraversion in patients is directly related to
their perception of social support [14], suggesting that it may influence the patients’
prognosis. This relevance of verbosity to the depression scenario, and the fact that
verbosity is easily perceivable and quantifiable, led us to choose it as the personality
trait to be modeled for a 21-year-old depressed virtual patient (VP), Cynthia Young
(created using VPF) [15-18], using our system.

The two phases involved in the system are described in the following sections. 

2.2 Mask Creation Phase 

In this phase, human responses to a subset of the stimuli in the corpus are used to 
create different personality masks for the intended personality. To achieve this, we 
apply the findings of Rossen et al. [10] by using an existing question-answering VH 
(a VP) to create a question-asking VH (a virtual doctor), using the set of stimuli (what 
the users can say to the VH) and responses (what the VH will respond to the user) 
from the original corpus. The question-asking VH (virtual doctor) is then used to 
gather the personality-specific responses that will form the personality masks for the 
intended personality. Below, we describe the steps for the Mask Creation phase. 

Step 1: Analysis of Interaction logs. Interaction logs of the existing corpus are analyzed
to find the most frequently triggered stimuli. Since these stimuli are most likely to
appear in future interactions with the VH, the ability of the VH to exhibit the intended
personality while responding to these stimuli is pivotal to the design of our system.
The threshold value for extracting such stimuli defaults to 25% usage and is customizable
by the domain expert. For the depression scenario, a set of approximately 100
stimulus-response pairs was extracted from the original corpus of about 3500 stimuli
and 400 responses.
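
The selection in Step 1 can be sketched in a few lines of Python. This is a minimal illustration, not the VPF implementation: the log format (one list of stimulus identifiers per interaction), the function name `frequent_stimuli`, and the reading of "25% usage" as "triggered in at least 25% of the logged interactions" are all assumptions made for this example.

```python
from collections import Counter


def frequent_stimuli(interaction_logs, threshold=0.25):
    """Return stimuli triggered in at least `threshold` of the logged interactions.

    `interaction_logs` is a list of interactions; each interaction is a list of
    the stimulus identifiers it triggered (a hypothetical log format).
    """
    if not interaction_logs:
        return []
    usage = Counter()
    for interaction in interaction_logs:
        # Count each stimulus once per interaction, however often it was triggered.
        usage.update(set(interaction))
    total = len(interaction_logs)
    selected = [s for s, n in usage.items() if n / total >= threshold]
    # Most widely used stimuli first; ties are broken alphabetically.
    return sorted(selected, key=lambda s: (-usage[s], s))


# Hypothetical logs from four interviews with the VP.
logs = [
    ["greeting", "name", "sleep"],
    ["greeting", "sleep", "mood"],
    ["greeting", "name"],
    ["greeting", "appetite"],
]
print(frequent_stimuli(logs, threshold=0.5))
```

Keeping the threshold as a parameter mirrors the statement that the domain expert can customize it.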

Step 2: Review and refinement by domain expert. A domain expert reviews and refines
the set of stimulus-response pairs selected in Step 1, using his/her expertise to
determine which of the stimuli can be used to elicit the intended personality
trait. For example, the analysis of the interaction logs for our depressed VP corpus
identified “What is your name?” as one of the most frequently used stimuli, but this
stimulus is not useful in eliciting specifically long or short responses and so does
not contribute to the creation of varied verbosity levels. For our example, 62 of the
100 most frequently used stimulus-response pairs were selected by the domain expert
for creating the virtual doctor.

Step 3: Response hint generation by domain expert. Once the set of interview questions
is finalized, the domain expert fills in response hints to guide the human interviewee
to phrase their answers in a manner consistent with the original scenario. For
example, the question “Have you had difficulty sleeping?” is open-ended and could
be answered with excessive, normal or reduced sleep. However, in the original case
history, the patient complained of excessive sleep, and changing this key fact could
completely change the diagnosis. The domain expert therefore added “Excessive
sleep” as a response hint for that stimulus.

Step 4: Gathering responses from human interviewees. After the virtual doctor is
created, an online chat-style interview link is sent out to several human interviewees
who present varied verbosity levels. Before starting the interview, a description of the
role that the interviewee will play during the interview is provided. In our example,
this description asks the interviewee to role-play as a 21-year-old student with
symptoms of depression and talk to the virtual doctor as they would under such
circumstances. During the interview, the virtual doctor asks the interviewee questions
while providing response hints for each question.

Step 5: Creation of Masks. Finally, the responses gathered from the human interviewees
are cast into different personality masks by the domain expert using personality
sliders that can be used to assign a “level” of the intended personality trait for each
response. Fig. 1 shows an example of the verbosity slider for one of the 62 stimuli.
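
One way to picture Step 5 is as grouping the rated responses by their slider level, so that each level yields one mask. The sketch below assumes discrete levels such as "low" and "high" and a tuple-based input; the function name `build_masks`, the data layout, and the sample responses are hypothetical and only illustrate the idea.

```python
from collections import defaultdict


def build_masks(rated_responses):
    """Group rated responses into one mask per personality level.

    `rated_responses` is an iterable of (stimulus_id, response_text, level)
    tuples, where `level` is the slider value assigned by the domain expert.
    Returns {level: {stimulus_id: response_text}}.
    """
    masks = defaultdict(dict)
    for stimulus_id, response_text, level in rated_responses:
        # If several responses share a stimulus and a level, keep the first one.
        # This tie-breaking rule is an assumption of the sketch.
        masks[level].setdefault(stimulus_id, response_text)
    return dict(masks)


# Hypothetical interviewee responses, consistent with the "Excessive sleep" hint.
rated = [
    ("sleep", "I sleep too much.", "low"),
    ("sleep", "Lately I have been sleeping far more than usual, "
              "sometimes twelve hours a day.", "high"),
    ("mood", "Sad.", "low"),
]
masks = build_masks(rated)
print(sorted(masks))
```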



Fig. 1. Verbosity sliders in the Personality Masks system for the depression example 


2.3 Mask Application Phase 

In this phase, the generated personality masks are applied onto the VH in the original 
corpus, resulting in a corpus that has personality-masked responses for some of the 
stimuli and default responses from the original corpus for the remaining stimuli. 
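
The application step amounts to an overlay: masked responses replace the originals where a mask entry exists, and every other stimulus keeps its default response. The following sketch assumes the corpus and the mask are both simple mappings from stimulus identifier to response text; the name `apply_mask` and the sample data are hypothetical.

```python
def apply_mask(original_corpus, mask):
    """Return a personality-masked copy of the corpus.

    Both arguments map stimulus identifiers to response text. Stimuli without a
    mask entry keep their default response; mask entries for stimuli that are
    not in the corpus are ignored.
    """
    masked = dict(original_corpus)  # leave the original corpus untouched
    for stimulus_id, response in mask.items():
        if stimulus_id in masked:
            masked[stimulus_id] = response
    return masked


# Hypothetical corpus fragment and low-verbosity mask.
corpus = {
    "greeting": "Hello",
    "sleep": "I have been sleeping a lot.",
    "mood": "I feel down.",
}
low_mask = {"sleep": "Too much.", "unknown": "ignored"}
masked_corpus = apply_mask(corpus, low_mask)
print(masked_corpus["sleep"], "|", masked_corpus["mood"])
```

Copying the corpus rather than editing it in place keeps the original VH available, so several variants can be derived from the same source.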

3 Evaluation Study

We conducted a between-subjects user study with verbosity level of the VP as the 
independent factor. The participants were medical students (n=31, mean age = 24.2 
years, 19 female and 12 male). They were randomly split into two groups and were 
assigned to interview either the high (n=16) or low (n=15) verbosity-level variant of 
the depressed VP in an online chat-style interaction. At the end of the interview, they 
completed a post-survey. Fig. 2 shows interaction transcripts for these conditions. 

The two verbosity masks were created from the responses that 11 female staff and
residents from the medical school gave to the virtual doctor’s questions in online
chat-style interviews that lasted about 20 minutes each.





Fig. 2. Interaction transcripts of low (left) and high (right) verbosity VP 

Primary Hypothesis. Participants in the low-verbosity group will perceive a lower 
level of verbosity in their VP than participants in the high-verbosity group. 

Secondary Hypothesis. Participants in the low-verbosity group will perceive their
VP as less concerned, less cooperative, less willing to disclose information and less
comfortable to deal with than participants in the high-verbosity group will perceive theirs.

The metrics used in the post-survey for this study are summarized in Table 1. 


Table 1. Likert scale rating statements (1 = Strongly Disagree, 5 = Strongly Agree) for all metrics

Metric                        Likert scale rating statement

Verbosity (Primary)           “The patient I spoke to gave detailed or long answers to the
                              questions that I asked her in the interview”
Concern (Secondary)           “I think the patient was quite concerned about her problem”
Cooperation (Secondary)       “I think the patient cooperated well with me in answering my ques-
Self-disclosure (Secondary)   “I think the patient was willing to disclose information about herself.”
Comfort (Secondary)           “I liked talking to the patient and felt like I was able to help her.”


4 Results 

The mean Likert scores for each metric across the two treatment groups are shown
in Fig. 3. Table 2 summarizes the results of the Mann-Whitney U tests on the Likert
scores. The scores for perceived levels of verbosity, concern, cooperation,
self-disclosure and comfort from participants in the low-verbosity treatment group
were significantly lower than the corresponding scores from participants in the
high-verbosity treatment group. All differences were significant at p < 0.05.
Thus, we accept both our hypotheses.
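
For readers unfamiliar with the test, the sketch below shows one way to compute a one-tailed Mann-Whitney U test on two groups of Likert ratings using only the Python standard library. It uses the normal approximation with a tie correction and no continuity correction, and reports an effect size as |z| divided by the square root of the total sample size, a common convention. These choices, the function names, and the sample ratings are assumptions of this example; it is not a reproduction of the analysis procedure used in the study, and a statistics package may give slightly different values.

```python
import math
from collections import Counter


def average_ranks(values):
    """Rank values from 1 to n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        mean_rank = (start + end) / 2 + 1  # positions are 0-based, ranks 1-based
        for k in range(start, end + 1):
            ranks[order[k]] = mean_rank
        start = end + 1
    return ranks


def mann_whitney_one_tailed(low, high):
    """Test whether `low` scores tend to be smaller than `high` scores.

    Returns (U for the `low` sample, z, effect size |z|/sqrt(N), one-tailed p).
    """
    n1, n2 = len(low), len(high)
    if n1 == 0 or n2 == 0:
        raise ValueError("both groups need at least one score")
    pooled = list(low) + list(high)
    n = n1 + n2
    ranks = average_ranks(pooled)
    u_low = sum(ranks[:n1]) - n1 * (n1 + 1) / 2
    # Tie correction: sum of (t^3 - t) over groups of t tied values.
    tie_term = sum(t ** 3 - t for t in Counter(pooled).values())
    variance = n1 * n2 / 12 * ((n + 1) - tie_term / (n * (n - 1)))
    if variance == 0:
        raise ValueError("all scores are identical; the test is undefined")
    z = (u_low - n1 * n2 / 2) / math.sqrt(variance)
    p = 0.5 * math.erfc(-z / math.sqrt(2))  # lower-tail probability P(Z <= z)
    effect_size = abs(z) / math.sqrt(n)
    return u_low, z, effect_size, p


# Hypothetical 5-point Likert ratings, not the study data.
low_group = [1, 2, 2, 3, 2, 1, 3, 2]
high_group = [4, 3, 5, 4, 4, 3, 5, 4]
print(mann_whitney_one_tailed(low_group, high_group))
```

A rank-based test is a natural fit here because Likert ratings are ordinal and heavily tied, which is also why the tie correction matters.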



Fig. 3. Mean Likert Scores for all metrics (Error bars represent standard errors) 


Table 2. Mann-Whitney U test statistics for all metrics 


Metric            U       z         r       p (one-tailed)

Verbosity         44.5    -3.24     0.58    0.001
Concern           59.0    -2.554    0.002   0.006
Cooperation       63.5    -2.325    0.42    0.010
Self-disclosure   61.5    -2.435    0.44    0.007
Comfort           79.5    -1.733    0.31    0.047


Table 3 summarizes the user comments, reinforcing the results from the post-survey. 


Table 3. Examples of participants’ responses to open-ended questions


Question Topic      High Verbosity Group                      Low Verbosity Group

                    “It was a very accurate experience. I     “It was very hard to ask further
                    enjoyed the activity.”                    questions on a subject when she gave
                                                              short or avoiding answers.”

                    “I felt like I wanted to speak more       “Real patients are more forthcoming
                    personally with her”                      when you ask them questions such as
                                                              ‘why did you come here today?’.”

Personality of      “Very typical depressed mood, but         “She seemed aware that there was a
the VP              eager to get back to feeling like         change in her behavior, but was hesitant
                    herself.”                                 to delve into the root of her issues.”

                    “She seemed very open & willing to        “Dry- perhaps she was simply depressed
                    talk to me even though she was            and down and having a hard time getting
                    depressed.”                               up the motivation to explain herself

5 Discussion

Analysis of the results of our study indicates that participants were able to perceive 
the intended verbosity levels in the VPs they interviewed. The results for concern, 
cooperation, self-disclosure and comfort were also congruent with the Big Five theory 
of personality characteristics in which talkativeness and sociability are frequently 
associated with extraversion [13, 16]. These results also showed that our system has 
been successful in shifting the effort from modeling standardized personalities to 
modeling perceivable personalities. 

However, we also observed that the verbosity levels participants perceived in the
high-verbosity VP were not as high as we expected them to be, even though the
difference between the groups was statistically significant. One potential reason is
that the high-verbosity VP might not have been a considerable deviation from regular
patients; previous work asserts that humans respond better to computer-based agents
that are highly exaggerated [19].

6 Conclusions and Future Work 

We have presented Virtual Human Personality Masks, a system that combines the
concepts of human computation [3] with the idea of using existing VHs to bootstrap
the creation of other VHs [10], to rapidly generate perceivable personalities in VHs.
Below, we present the future directions for this research as identified by our
experiences in designing the Personality Masks system and our understanding of
potential applications for this system.

We aim to extend our work to model more complex personalities, such as the
personality traits listed in the Big Five Personality model (OCEAN model) [15]. From
our experiences with modeling verbosity as a personality trait for our depressed VP,
we identified that modeling complex personalities would need inputs from other
sources as well. In the case of our verbosity example, both the review and refinement
of stimuli for the role-reversed VH and the response hint generation were done by the
domain expert. However, for more complex personalities, psychology experts might
provide guidelines on selecting the stimuli that can elicit the intended personality,
while the domain experts provide response hints for each of those stimuli.

We also intend to explore the learning effects of using the personality-masked VHs 
in interpersonal skills education by running studies that measure the experience 
gained by interacting with VHs of different personalities. 

We will also run follow-up studies to compare the extent of personality perception
across different scenarios - for example, comparing the perception of personalities in
the depression scenario (which needs a high level of emotional connection between
the patient and the doctor) with those in a Cranial Nerve disorder scenario (which
mostly involves physical exams and less verbal interaction between the patient and
the doctor). Through these studies, we will aim to establish a relationship between the
type of scenario and the extent to which virtual human personalities affect
the learning of interpersonal skills in that scenario.